Changhao Pan

Towards Streaming Synchronized Spatial Audio Generation via Autoregressive Diffusion Transformer

May 29, 2026
Via arXiv

SwanVoice: Expressive Long-Form Zero-Shot Speech Synthesis for Both Monologue and Dialogue

May 29, 2026
Via arXiv

Comprehensive Benchmarking of Long-Form Speech Generation in Diverse Scenarios

May 27, 2026
Via arXiv

TMD-Bench: A Multi-Level Evaluation Paradigm for Music-Dance Co-Generation

May 03, 2026
Via arXiv

Diffusion Model as a Generalist Segmentation Learner

Apr 27, 2026
Via arXiv

ImVideoEdit: Image-learning Video Editing via 2D Spatial Difference Attention Blocks

Apr 09, 2026
Via arXiv

Modeling and Benchmarking Spoken Dialogue Rewards with Modality and Colloquialness

Mar 16, 2026
Via arXiv

Synthetic Singers: A Review of Deep-Learning-based Singing Voice Synthesis Approaches

Jan 20, 2026
Via arXiv

STARS: A Unified Framework for Singing Transcription, Alignment, and Refined Style Annotation

Jul 09, 2025
Via arXiv

Speech Quality Assessment Model Based on Mixture of Experts: System-Level Performance Enhancement and Utterance-Level Challenge Analysis

Jul 08, 2025
Via arXiv